Association between dichotomized VASARI feature and overall survival in glioblastoma patients: a single-institution propensity score matching analysis

Objectives This study aimed to investigate the intra- and inter-observer consistency of the Visually Accessible Rembrandt Images (VASARI) feature set before and after dichotomization, and the association between dichotomous VASARI features and the overall survival (OS) in glioblastoma (GBM) patients. Methods This retrospective study included 351 patients with pathologically confirmed IDH1 wild-type GBM between January 2016 and June 2022. Firstly, VASARI features were assessed by four radiologists with varying levels of experience before and after dichotomization. Cohen’s kappa coefficient (κ) was calculated to measure the intra- and inter-observer consistency. Then, after adjustment for confounders using propensity score matching, Kaplan-Meier curves were used to compare OS differences for each dichotomous VASARI feature. Next, patients were randomly stratified into a training set (n = 211) and a test set (n = 140) in a 3:2 ratio. Based on the training set, Cox proportional hazards regression analysis was adopted to develop combined and clinical models to predict OS, and the performance of the models was evaluated with the test set. Results Eleven VASARI features with κ value of 0.61–0.8 demonstrated almost perfect agreement after dichotomization, with the range of κ values across all readers being 0.874–1.000. Seven VASARI features were correlated with GBM patient OS. For OS prediction, the combined model outperformed the clinical model in both training set (C-index, 0.762 vs. 0.723) and test set (C-index, 0.812 vs. 0.702). Conclusion The dichotomous VASARI features exhibited excellent inter- and intra-observer consistency. The combined model outperformed the clinical model for OS prediction. Supplementary Information The online version contains supplementary material available at 10.1186/s40644-024-00754-z.


Introduction
Glioblastoma (GBM) is the most common primary malignant brain tumor [1].The current standard of care for GBM is surgery followed by concurrent chemoradiation and adjuvant temozolomide [2], resulting in a median overall survival (OS) of 8 months [3].Despite the poor prognosis, 30% and 10% of GBM patients achieve an OS of more than 2 and 5 years, respectively [4].Aggressive therapy is required for this subset of patients.Therefore, accurate prognostic assessment is crucial for guiding clinical decision-making.
Several clinical factors have been identified as prognostic indicators, including age, gender, pre-operative Karnofsky performance status (KPS) score, extent of resection and treatment plan [5][6][7].Although molecular biomarkers such as isocitrate dehydrogenase (IDH) mutation and O 6 -methylguanine DNA methyltransferase (MGMT) promoter methylation status, aid in the precise stratification of GBM survival [8], current methods for obtaining such information are always invasive, costly, and vulnerable to sampling errors resulting from GBM's high heterogeneity.Therefore, noninvasive prognostic stratification tools are urgently needed.
Magnetic resonance imaging (MRI), as a noninvasive imaging modality, plays an important role in the diagnosis and prognosis assessment of GBM patients.Radiomics, as an emerging field of research, focuses on developing novel prognostic biomarkers through datadriven analysis of radiologic images [9].However, growing evidence has confirmed that many factors affect the quality of the radiomics models, including image acquisition and normalization, segmentation, feature extraction, and computational statistics [10].Furthermore, understanding the underlying biological rationale behind radiomics features remains challenging [11].While radiomics has primarily academic application, its use in real-world clinical settings is currently limited.
Visually Accessible Rembrandt Images (VASARI) [12], developed for accurate and reproducible glioma assessment based on conventional MRI, can improve clinical management through easier communication of results between radiologists and clinicians.In comparison with functional MRI and radiomics, evaluating VASARI features is simple, requiring no tumor segmentation or complicated post-processing, and its results are highly interpretable, making it an invaluable tool.Previous studies have reported that VASARI features were capable of predicting the OS of GBM patients [13][14][15][16].Although these findings are promising, there is still room for improvement.Firstly, some of the terms used are not intuitive, such as the proportions of enhancing tissue, non-enhancing tissue, necrosis, and edema, which are categorized in six groups: < 5%, 6-33%, 34-67%, 68-95%, < 95%, and 100%.This can lead to low inter-reader agreement and poor reproducibility in the daily reporting system [17].Additionally, no comparative studies were performed with balancing known variables such as age, gender, extent of resection, KPS score, and MGMT methylation status.Moreover, in the context of 5th edition of the WHO central nervous system (CNS) tumor classification, GBM has been redefined, excluding the IDH mutation subtype [18].Therefore, it is necessary to re-evaluate the prognostic value of VASARI features for GBM patients.
In this study, we restricted the study population to IDH1 wild-type GBM.Our aims were threefold: (i) to dichotomize VASARI feature to achieve better repeatability and stability; (ii) to investigate the relationship between dichotomous VASARI feature and the OS using propensity score matching (PSM) to adjust for confounding factors (age, gender, KPS score, extent of resection, therapy, and MGMT_status); and (iii) to develop and validate a model for predicting OS of GBM patients by integrating clinicopathologic variables and VASARI features.

Patients
This retrospective study was approved by the Ethics Committee of Tangdu Hospital (TDLL-20151013) and written informed consent was waived.From January 2016 to June 2022, a total of 464 patients with pathologically confirmed GBM (according to the WHO 2016 classification of CNS tumors) were consecutively collected.
Inclusion criteria were: (i) age ≥ 18 years; (ii) preoperative and postoperative MRI were completed within 72 h, including sequences consisting of T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), fluid attenuated inversion recovery (FLAIR), and contrast-enhanced T1WI (T1CE); (iii) clinicopathological information was well documented, including age, gender, preoperative KPS score, therapy, MGMT promoter methylation status, and IDH 1 mutation status.The exclusion criteria were: (i) IDH1 mutant GBM; (ii) the image quality was unsatisfactory due to susceptibility or motion artifacts; (iii) patient underwent biopsy, radiotherapy, or chemotherapy prior to preoperative MRI; (iv) deaths not related to GBM, such as acute cardiovascular and cerebrovascular events or COVID-19; (v) patients lost to follow-up immediately after discharge from the hospital following operation.Ultimately, 351 patients were enrolled.Figure 1 depicts the patient selection process.

Study design
The study pipeline is illustrated in Fig. 2. Initially, the consistency of VASARI feature extraction was evaluated by four radiologists with varying levels of experience.Next, 21 multi-categorical features were simplified to bicategories.Features with κ value < 0.8 were re-evaluated for consistency.Subsequently, the relationship between dichotomous VASARI features and OS was assessed using the Kaplan-Meier survival curves on the entire cohort.For features with log-rank test P value < 0.05, PSM was used to control for confounders such as age, gender, KPS score, therapy, extent of resection, and MGMT promoter methylation status.Following PSM, the relationship between VASARI features and OS was re-assessed, and sensitivity analyses were conducted to evaluate the robustness of the findings.Finally, a clinical model and a combined model were constructed based on clinical, pathological, and VASARI features to predict the OS of GBM patients.

Outcome and follow-up
The primary outcome of this study was OS, defined as days from the initial surgery to either the death or last

Definition of study variables
Baseline clinical information included age at diagnosis (treated as a continuous variable), gender (male or female), and preoperative KPS score (treated as a continuous variable).Therapy is categorized as either standard of care (SOC) or non-SOC.The standard of care encompasses a comprehensive treatment plan that includes maximal surgical resection followed by the Stupp protocol [2].The neurosurgeon (L.W) and radiologist (L.F.Y) collaborated to determine the extent of resection (EOR), which was measured by comparing changes in tumor volume on MRI before and within 72 h after surgery.Grossly total resection (GTR) was defined as the removal of ≥ 90% of the tumor volume, while removal of < 90% was classified as non-GTR [19].Two neuropathologists (L.L.F. and J.M.), with 10 and 8 years of experience in gliomas diagnosis, respectively, were blinded to patient information and re-evaluated the pathological data.Of these, molecular phenotypes including IDH1/2 mutation status (wild-type or mutant) and MGMT promoter methylation status (methylation or unmethylation), was evaluated using polymerase chain reaction and direct sequencing, respectively.

MRI acquisitions
MRI studies were conducted at our institution using either a 3.0 T or 1.5 T unit from different scanners, with various parameters to reflect real-life inter-center heterogeneity.The brain tumor imaging protocol included T1WI, T2WI, FLAIR, and T1CE.The detailed acquisition parameters are summarized in the Supplementary Table S1.

VASARI feature set: consistency analyses, dichotomization and reanalyses
The image analysis pipeline is shown in step 1 of Fig.    VASARI feature scoring system for gliomas (https://wiki.nci.nih.gov/display/CIP/VASARI) and presented several example cases.It should be noted that feature F17 (diffusion) was not extracted in the present study cohort due to the absence of diffusion-weighted imaging sequences.After a 2-month washout period, reassessment was conducted by senior members of the junior and senior groups.
To improve the consistency of VASARI feature and increase its clinical utility, we simplified the 21 multicategorical VASARI features into dichotomous ones, with criteria detailed in Table 1.Dichotomized VASARI feature is denoted as FnM.For those features with κ value < 0.8 before simplification, consistency analyses were conducted twice, with 3-month washout period, to evaluate the impact of the simplification.

Statistical analysis
All statistical analyses were conducted by Y.Y. and J.H.L using R software (Version 3.5.3).Two-tailed P < 0.05 indicated a significant difference.

Comparison of baseline characteristics
Continuous variables are presented as "mean ± standard deviation (SD)" or "median (interquartile range)" and were compared between groups using either the student t test or Mann-Whitney U test, depending on the normality of the data.Categorical variables are presented as "number (percentage)" and were compared between groups using chi-square or Fisher's exact test.

Consistency analysis for VASARI feature set
The consistency of VASARI features before and after dichotomization was assessed using Cohen Kappa analysis, in which the ordered VASARI features were analyzed using linearly weighted kappa coefficient [20].

Propensity score matching
We employed PSM to mitigate the confounding effects of known predictors for GBM survival, thus enhancing the relevance of the results to real-world clinical practice.Propensity scores were calculated using logistic regression with the following covariates: age, gender, KPS score, EOR, therapy, and MGMT_status.Patients were then selected by 1:1 matching without replacement using the nearest-neighbor method based on their propensity scores.A caliper width of 0.1 standardized differences was used for matching [21,22].To boost the total sample size, other PSM ratios were considered, such as 1:2, 1:3, and 1:4 [21].

Association of VASARI features with overall survival
As shown in step 2 of Fig. 2, Kaplan-Meier analysis with the log-rank test was used to compare the OS of both the before-and after-PSM cohorts.After PSM, we further conducted sensitivity analyses using univariable and multivariable Cox regression to determine whether the association between OS and VASARI feature remained stable.The following variables were adjusted in multivariable Cox regression including: age, gender, KPS score, EOR, therapy, and MGMT_status.

Overall survival prediction model construction and performance evaluation
As shown in step 3 of Fig. 2, all patients were divided into a training set (n = 211) and test set (n = 140) at a ratio of 3:2 using random stratified sampling to ensure balanced patient characteristics.Multivariate Cox proportional hazards analyses with backward stepwise elimination were performed to develop OS prediction models using the training set.The performance of models was quantified with Harrell's concordance probability index (C-index).Areas under the time-dependent ROC curves (AUC) for the 1-year, 2-year, and 3-year follow-ups were calculated and compared between models.Calibration curves and decision curves were plotted to assess the usefulness of the predictive models.The model-averaged importance of the variables was assessed by the best subsets approach [23].

Patient characteristics
A total of 351 patients were enrolled.There were 228 men and 123 women, with a median age of 58 years (range: 51 − 65 years).Table 2 summarizes the patient characteristics of the entire cohort, as well as the training and test cohorts.The median OS of the entire cohort, training cohort and test cohort were 348 days, 320 days, and 363 days, respectively.No significant differences in age, gender, KPS score, therapy, EOR, or MGMT_status was observed between the training and test cohorts (all P > 0.05).

Consistency analyses for VASARI feature set before and after dichotomization
Table 3 presents the results of consistency analyses conducted on the VASARI feature set before and after dichotomization.The results indicate that the senior group outperformed the junior group in the consistency analyses.Before dichotomization, a total of 11 VASARI features with κ value < 0.8 were identified in both the junior and senior groups: F3 (eloquent brain), F5 (proportion enhancing), F6 (proportion nCET), F7 (proportion necrosis), F9 (multifocal or multicentric), F13 (definition of the non-enhancing margin), F14 (proportion of edema), F22 (nCET tumor crosses midline), F26 (extent of resection of enhancing tumor), F27 (extent resection of nCET) and F28 (extent resection of vasogenic edema).After dichotomization, the consistency of these aforementioned features improved, with κ value ranging from 0.874 to 1.000.

Association between VASARI features and overall survival before and after PSM
As shown in Table 4, the Kaplan-Meier curve analyses indicated that 11 VASARI features were associated with OS of GBM patients in the entire cohort.Subsequently, the distribution of confounders (age, gender, KPS, EOR, therapy, and MGMT_status) for these features was balanced between groups using PSM.The baseline characteristics, love plots for standardized mean difference, and Kaplan-Meier curves before and after matching are provided in the Supplementary files (Table S3-S24, Figure S1-S11).After performing PSM, we identified 7 features that were significantly associated with shorter OS.These features are as follows: F2M (center/bilateral), F12M (poorly-defined), F15M (edema crosses midline), F19 (ependymal invasion), F21 (deep white matter invasion), F22M (nCET tumor crosses midline), and F23M (enhancing tumor crosses midline).Notably, the relationship between these features and OS remained consistent even after sensitivity analyses.

Performance evaluation of overall survival prediction models
The clinical and combined models were constructed using the training cohort.According to the modelaveraged importance, F19, EOR, and therapy were the top-ranked variables (Fig. 3).The results of multivariate Cox proportional hazards regression analyses for both models are presented in the Supplementary files (Table S25-S26).
As shown in Table 5, the combined model exhibited superior performance compared to the clinical model.In the training set, the C-index for the combined and clinical model were 0.762 (0.734-0.790), and 0.723 (0.688-0.758), respectively.In the test set, the C-index was 0.812 (0.773-0.851), and 0.702 (0.660-0.744) for the combined and clinical models, respectively.Table 5 also shows the time-dependent discrimination measures for death up to 3 years for the clinical and combined models.Based on training cohort, the combined model achieved better performance than the clinical model for predicting 1-year OS (combined model: AUC, 0.832 vs. clinical model: AUC, 0.786, P = 0.039), 2-year OS (combined model: AUC, 0.809 vs. clinical model: AUC, 0.741, P < 0.001), and 3-year OS (combined model: AUC, 0.853 vs. clinical model: AUC, 0.799, P = 0.021).Similarly, in the test set, the AUC values were higher for the combined model compared to the clinical model, and the difference was statistically significant, except for 1-year OS.
The calibration curves of the two models exhibited a good agreement between predicted and actual observed probabilities for both the training and test cohorts (Fig. 4A-D).Decision curve analysis demonstrated that the combined model provided a higher net benefit for predicting GBM OS compared to the clinical model across the threshold probability range of 0.2-1.0 (Fig. 5).

Discussion
In our study, we simplified certain VASARI features to maximize reproducibility and stability, regardless of the radiologist's level of experience.Then, after controlling for confounders using PSM, seven VASARI features were shown to be associated with shorter OS in GBM patients, namely F2M (center/bilateral), F12M (poorly-defined), F15M (edema crosses midline), F19 (ependymal invasion), F21 (deep white matter invasion), F22M (nCET tumor crosses midline), and F23M (enhancing tumor crosses midline).Ultimately, the combined model, consisting of clinicopathologic variables and VASARI features, demonstrated superior predictive performance for OS compared to the clinical model.The VASARI feature set was chosen for investigation in this study due to two primary considerations.Firstly, the 29 VASARI features extracted exclusively from conventional MRI (T1WI, T2WI, FLAIR and T1CE) are standard sequences for the initial diagnosis of suspected tumor cases.These sequences are routinely accessible in both teaching and non-teaching hospitals.Furthermore, the assessment of MRI semantic features is less susceptible to the influence of acquisition equipment, scanning parameters, and post-processing algorithms compared to functional MRI.Despite the promising potential of emerging radiomics in prognostic evaluation, the VASARI feature set offers better practicality and interpretability for routine applications in radiology and neurosurgery.
Previous studies have confirmed the significant role of VASARI features in predicting genotyping and assessing treatment response and prognosis [15,16,[24][25][26][27].Most of the aforementioned studies were of single-center or single-model MRI data, and VASARI features were assessed by two raters.Any discrepancies were resolved through consultation or by involving a third qualified physician.However, in clinical practice, it is much more common for physicians to assess preoperative MRI from different hospitals independently.Consequently, there is a paucity of research investigating how a radiologist's level of experience impacts the consistency assessment of VASARI features in a multi-model and multi-parameter setting.The present study addresses this gap and found that: (i) The consistency of radiologist's assessment was better in the senior group than in the junior group, suggesting that the VASARI feature extraction was experience-dependent. (ii) The features with κ value of < 0.8 in both the senior and junior groups were mainly multicategorical variables such as necrosis, enhancement, edema, and the percentage of degree of tumor resection, as well as confounding features such as multicentricity or multifocality.Similar to our findings, Kim et al [17] and Chen et al [15] also found that some multicategorical VASARI features are challenging to assess with high agreement, consequently restricting their clinical utility.Therefore, only those features with high stability and reproducibility can provide reliable information for the diagnosis and management of glioma [28].Based on the considerations  mentioned above, therefore, we simplified the VASARI features into dichotomous variables.Although the simplified VASARI lose some detailed information, the consistency of feature assessment is significantly improved, in other words, even when independently assessed by low-seniority radiologist, high intra-and inter-observer consistency can still be obtained.Numerous studies have investigated the association between VASARI features and GBM patient OS, yet a consensus has not been reached.Peeken et al. [26].conducted VASARI feature extraction in 189 patients with GBM.The results of univariate Cox regression analysis indicated an association between OS and 10 VASARI features.These features include F9 (multifocal or multicentric), F24 (satellites), F19 (ependymal invasion), F21 (deep white matter invasion), F13 (definition of the non-enhancing margin), F26 (extent of resection of enhancing tumor), F27 (extent resection of nCET), F14 (proportion of edema), and F29 & F30 (lesion size).After assessing the significance of variables within the multivariate model, the critical features for predicting OS were identified as F9, F21, F24, and F19.Nicolas Jilwan et al. [29].analyzed the MRI of 102 GBMs.In the univariable analysis, F1 (tumor location), F2 (side of tumor epicenter), F5 (proportion enhancing), and F10 (T1/ FLAIR ratio) were related to OS, but after adjusting for clinical variables (chemotherapy) and HRAS copy number, only F5 was an independent predictor for OS.In another retrospective study cohort of 98 adult GBMs, the findings based on Kaplan-Meier analysis demonstrated a significant association between F1 (tumor location), F6 (proportion nCET), F7 (proportion necrosis), and OS.However, in the multivariate Cox regression analysis, only F6 and F7 were identified as independent predictors of OS [16].In contrast to previous studies, in the study by Chen et al. [15].VASARI features were found to be irrelevant to OS after retrospectively analyzing MRI data from 129 cases of GBM.The discrepancies observed in these findings could be due to several factors, including the limited size of the sample, the heterogeneous composition of the study population (comprising both IDH-mutant and wild-type GBM), and the insufficient implementation of rigorous confounding measures.According to the latest edition of the 2021 WHO guidelines for the classification of CNS tumors [18], IDHmutant GBM was excluded.So, the relationship between VASARI features and OS in wild-type GBM need to be further elucidated.Based on the present study cohort, we identified seven MR semantic features that were associated with shorter patient OS after balancing potential confounders using PSM, namely F2M (center/bilateral), F12M (poorly-defined), F15M (edema crosses midline), F19 (ependymal invasion), F21 (deep white matter invasion), F22M (nCET tumor crosses midline), and F23M (enhancing tumor crosses midline), and this correlation persisted after sensitivity analyses.Notably, in terms of OS prediction, F19 is the top-ranked feature in terms of model-averaged importance among the seven features mentioned above.Consistent with our findings, several studies [30][31][32] have demonstrated that patients with subventricular zone involvement have a poor prognosis.Furthermore, our previous study [33] utilizing diffusionweighted imaging and arterial spin labeling imaging to predict the methylation status of the MGMT promoter in IDH wild-type GBM indicated that patients with subventricular zone involvement predominantly exhibited a non-methylated MGMT promoter status and had a dismal prognosis.
There are several limitations in our study.Firstly, our study is a retrospective, single-center study with a small sample size, and the generalizability of its findings needs to be further verified in prospective studies.In addition, our study excluded some cases of all-cause mortality, which may introduce bias.Secondly, this study did not consider the factor of treatment options after tumor recurrence.This was due to the heterogeneity of treatment options available in our center following recurrence.Additionally, a review also highlighted the lack of available treatments to prolong the OS of recurrent GBM [34].Finally, the radiologists involved in VASARI feature extraction were from the same institution, but consisted of individuals with different levels of experience, resembling actual clinical scenarios.

Conclusions
In conclusion, the simplified VASARI features have improved reproducibility and stability regardless of the radiologist's level of experience.After adjusting for confounders, a subset of VASARI features correlates with OS in GBM patients.Ultimately, the combined model constructed by combining clinicopathologic and VASARI features achieved better OS prediction efficacy than the clinical model, but its generalizability needs to be further validated in prospective studies.
outpatient follow-up visit.Patient follow-up data is regularly collected by specialists Y.L.Z and D.J.H through hospital chart reviews and telephone interviews.The last follow-up of patients ended on August 1, 2023.

Fig. 1
Fig. 1 Flow diagram for patient selection

2 .
Before starting image analysis, all pre-and post-operative MRI images were anonymized.VASARI feature extraction was independently performed by four radiologists, consisting of two senior radiologists (Z.C.L and Y.H, with 10, 12 years of experience in brain tumor diagnosis, respectively) and two junior radiologists (J.Z and Y.Y.W, with 4, 6 years of experience in brain tumor diagnosis, respectively).Prior to feature extraction, each radiologist reviewed the VASARI criteria (Supplementary Table S2 ) and then participated in a 1-hour teaching session where the senior neuroradiologist (G.B.C, with 20 years of experience in brain tumor diagnosis) provided a review of

Fig. 3
Fig. 3 Analysis of model-averaged importance of candidate variables

Fig. 4 Fig. 5
Fig. 4 Calibration curves for the clinical model (A, training set; B, test set) and combined model (C, training set; D), test set)

Table 1
Dichotomized VASARI feature set definition and classification criteriaMultifocal is defined as having at least one region of tumor, either enhancing or nonenhancing, which is not contiguous with the dominant lesion and is outside the region of signal abnormality (edema) surrounding the dominant mass.This can be defined as those resulting from dissemination or growth by an established route, spread via commissural or other pathways, or via CSF channels or local metastases, whereas Multicentric are widely separated lesions in different lobes or different hemispheres that cannot be attributed to one of the previously mentioned pathways.Gliomatosis refers to generalized neoplastic transformation of the white matter of most of a hemisphere.The scoring is not applicable if there is no contrast enhancement.If most of the enhancing rim Is thin, regular, and has homogenous enhancement the grade is thin.If most of the rim demonstrates nodular and/or thick enhancement, the grade is thick.If there is only solid enhancement and no rim, the grade is None.
F10MT1/FLAIR RATIO Tumor feature summary.[Mixed,expansive or infiltrative].Expansive = size of pre-contrast T1abnormality (exclusive of signal intensity) approximates size of FLAIR abnormality.Mixed = Size of T1 abnormality moderately less than FLAIR envelope; Infiltrative = Size of pre-contrast T1 abnormality much smaller than size of FLAIR abnormality.(UseT2 if FLAIR is not provided) (Assuming that the the entire abnormality may be comprised of: (1) an enhancing component, (2) a non-enhancing component, (3) a necrotic component and (4) a edema component.).Note: Dichotomized VASARI feature is denoted as FnM

Table 2
Baseline characteristics of the patients

Table 3
Inter-observer and intra-observer consistency analyses of VASARI feature sets before and after dichotomization for junior and senior groups

Table 4
Association between VASARI features and overall survival before and after propensity score matching

Table 5
Comparison of discrimination performance between clinical and combined model in the training and test cohorts